New quaternion in HLSL #960

keptsecret · 2025-12-02T10:14:17Z

No description provided.

devshgraphicsprogramming · 2025-12-03T08:51:16Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t create(scalar_type x, scalar_type y, scalar_type z, scalar_type w)
+    {
+        this_t q;
+        q.data = data_type(x, y, z, w);
+        return q;
+    }


imho this is more painful than just

math::quaternion q; q.data = vector_of_xyzw;

delete this pseudo-constructor

devshgraphicsprogramming · 2025-12-03T08:51:39Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    // angle: Rotation angle expressed in radians.
+    // axis: Rotation axis, must be normalized.
+    static this_t create(scalar_type angle, const vector3_type axis)


can we have axis first, then angle ?

btw you can also throw in a uniformScale = 1.f into the mix :P

devshgraphicsprogramming · 2025-12-03T08:52:15Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t create(NBL_CONST_REF_ARG(this_t) other)
+    {
+        return other;
+    }


we don't need this

devshgraphicsprogramming · 2025-12-03T08:54:21Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t create()
+    {
+        this_t q;
+        q.data = data_type(0.0, 0.0, 0.0, 1.0);
+        return q;
+    }


i think the create overloads need to have their argument types templated with NBL_FUNC_REQUIRES because DXC will screw us over with implicit conversions

devshgraphicsprogramming · 2025-12-03T09:00:50Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t create(scalar_type pitch, scalar_type yaw, scalar_type roll)
+    {
+        const scalar_type rollDiv2 = roll * scalar_type(0.5);
+        const scalar_type sr = hlsl::sin(rollDiv2);
+        const scalar_type cr = hlsl::cos(rollDiv2);
+
+        const scalar_type pitchDiv2 = pitch * scalar_type(0.5);
+        const scalar_type sp = hlsl::sin(pitchDiv2);
+        const scalar_type cp = hlsl::cos(pitchDiv2);
+
+        const scalar_type yawDiv2 = yaw * scalar_type(0.5);
+        const scalar_type sy = hlsl::sin(yawDiv2);
+        const scalar_type cy = hlsl::cos(yawDiv2);
+
+        this_t output;
+        output.data[0] = cr * sp * cy + sr * cp * sy; // x
+        output.data[1] = cr * cp * sy - sr * sp * cy; // y
+        output.data[2] = sr * cp * cy - cr * sp * sy; // z
+        output.data[3] = cr * cp * cy + sr * sp * sy; // w
+
+        return output;
+    }


I'd break this down into two functions

template<typename T=vector<scalar_type,2> > requires is_same_v<vector<scalar_type,2>,T> static this_t create(const T halfPitchCosSin, const T halfYawCosSin, const T halfRollCosSin); template<typename T=scalar_type> requires is_same_v<scalar_type,T> static this_t create(const T pitch, const T yaw, const T roll);

devshgraphicsprogramming · 2025-12-03T09:05:35Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t create(NBL_CONST_REF_ARG(matrix_type) m)
+    {
+        const scalar_type m00 = m[0][0], m11 = m[1][1], m22 = m[2][2];
+        const scalar_type neg_m00 = bit_cast<scalar_type>(bit_cast<AsUint>(m00)^0x80000000u);
+        const scalar_type neg_m11 = bit_cast<scalar_type>(bit_cast<AsUint>(m11)^0x80000000u);
+        const scalar_type neg_m22 = bit_cast<scalar_type>(bit_cast<AsUint>(m22)^0x80000000u);
+        const data_type Qx = data_type(m00, m00, neg_m00, neg_m00);
+        const data_type Qy = data_type(m11, neg_m11, m11, neg_m11);
+        const data_type Qz = data_type(m22, neg_m22, neg_m22, m22);
+
+        const data_type tmp = hlsl::promote<data_type>(1.0) + Qx + Qy + Qz;
+        const data_type invscales = hlsl::promote<data_type>(0.5) / hlsl::sqrt(tmp);
+        const data_type scales = tmp * invscales * hlsl::promote<data_type>(0.5);
+
+        // TODO: speed this up
+        this_t retval;
+        if (tmp.x > scalar_type(0.0))
+        {
+            retval.data.x = (m[2][1] - m[1][2]) * invscales.x;
+            retval.data.y = (m[0][2] - m[2][0]) * invscales.x;
+            retval.data.z = (m[1][0] - m[0][1]) * invscales.x;
+            retval.data.w = scales.x;
+        }
+        else
+        {
+            if (tmp.y > scalar_type(0.0))
+            {
+                retval.data.x = scales.y;
+                retval.data.y = (m[0][1] + m[1][0]) * invscales.y;
+                retval.data.z = (m[2][0] + m[0][2]) * invscales.y;
+                retval.data.w = (m[2][1] - m[1][2]) * invscales.y;
+            }
+            else if (tmp.z > scalar_type(0.0))
+            {
+                retval.data.x = (m[0][1] + m[1][0]) * invscales.z;
+                retval.data.y = scales.z;
+                retval.data.z = (m[0][2] - m[2][0]) * invscales.z;
+                retval.data.w = (m[1][2] + m[2][1]) * invscales.z;
+            }
+            else
+            {
+                retval.data.x = (m[0][2] + m[2][0]) * invscales.w;
+                retval.data.y = (m[1][2] + m[2][1]) * invscales.w;
+                retval.data.z = scales.w;
+                retval.data.w = (m[1][0] - m[0][1]) * invscales.w;
+            }
+        }
+
+        retval.data = hlsl::normalize(retval.data);
+        return retval;
+    }


only an orthogonal (no skew) and uniform scale matrix can be converted to a quaternion.

so dot products between the rows and columns should be 0, and the dot product of each column with itself should have the same length

You can write the whole algorithm assuming this, so invscales would be a scalar.

You could assert the orthogonality (no skew) and uniform scaling property OR take an extra const bool dontAssertValidMatrix=false where instead of asserting you return a NaN filled quaternion if the matrix is not valid

devshgraphicsprogramming · 2025-12-03T09:06:06Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t create(NBL_CONST_REF_ARG(truncated_quaternion<T>) first3Components)
+    {
+        this_t retval;
+        retval.data.xyz = first3Components.data;
+        retval.data.w = hlsl::sqrt(scalar_type(1.0) - hlsl::dot(first3Components.data, first3Components.data));
+        return retval;
+    }


put this wholly inside the static_cast from truncated to full quaternion

devshgraphicsprogramming · 2025-12-03T09:14:43Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+        const AsUint negationMask = hlsl::bit_cast<AsUint>(totalPseudoAngle) & AsUint(0x80000000u);
+        const data_type adjEnd = hlsl::bit_cast<scalar_type>(hlsl::bit_cast<AsUint>(end.data) ^ negationMask);
+
+        this_t retval;
+        retval.data = hlsl::mix(start.data, adjEnd, fraction);


Please unit test/check it somehow (re-derive) I'm not sure that when totalPseudoAngle<0 the result is meant to be S+(-E-S)*fraction and not `S+(E-S)*(-fraction)

P.S. btw my recent learnings about GPU architectures have made me aware that there's more FP32 cores than INT32 on Nvidia so write a comment to benchmark the sign xor flip against just *sign(totalPseudoAngle)

devshgraphicsprogramming · 2025-12-03T09:17:31Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t lerp(const this_t start, const this_t end, const scalar_type fraction)
+    {
+        return lerp(start, end, fraction, hlsl::dot(start.data, end.data));
+    }
+
+    static scalar_type __adj_interpolant(const scalar_type angle, const scalar_type fraction, const scalar_type interpolantPrecalcTerm2, const scalar_type interpolantPrecalcTerm3)
+    {
+        const scalar_type A = scalar_type(1.0904) + angle * (scalar_type(-3.2452) + angle * (scalar_type(3.55645) - angle * scalar_type(1.43519)));
+        const scalar_type B = scalar_type(0.848013) + angle * (scalar_type(-1.06021) + angle * scalar_type(0.215638));
+        const scalar_type k = A * interpolantPrecalcTerm2 + B;
+        return fraction + interpolantPrecalcTerm3 * k;
+    }
+
+    static this_t flerp(const this_t start, const this_t end, const scalar_type fraction)
+    {
+        const scalar_type pseudoAngle = hlsl::dot(start.data,end.data);
+        const scalar_type interpolantPrecalcTerm = fraction - scalar_type(0.5);
+        const scalar_type interpolantPrecalcTerm3 = fraction * interpolantPrecalcTerm * (fraction - scalar_type(1.0));
+        const scalar_type adjFrac = __adj_interpolant(hlsl::abs(pseudoAngle),fraction,interpolantPrecalcTerm*interpolantPrecalcTerm,interpolantPrecalcTerm3);
+
+        this_t retval = lerp(start,end,adjFrac,pseudoAngle);
+        retval.data = hlsl::normalize(retval.data);
+        return retval;
+    }


write a comment that lerp does not resepect scales, can stick an assert in lerp that start and end are both unit length

also make behaviour consistent, either lerp and flerp both normalize or both don't normalize the result (in case you go with normalization, make unnormLerp and unnormFlerp)

devshgraphicsprogramming · 2025-12-03T09:17:58Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    vector3_type transformVector(const vector3_type v)
+    {
+        scalar_type scale = hlsl::length(data);
+        vector3_type direction = data.xyz;
+        return v * scale + hlsl::cross(direction, v * data.w + hlsl::cross(direction, v)) * scalar_type(2.0);
+    }


add a const bool assumeNoScale=false argument

devshgraphicsprogramming · 2025-12-03T09:20:24Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+        retval.data.x = bit_cast<scalar_type>(bit_cast<AsUint>(data.x)^0x80000000u);
+        retval.data.y = bit_cast<scalar_type>(bit_cast<AsUint>(data.y)^0x80000000u);
+        retval.data.z = bit_cast<scalar_type>(bit_cast<AsUint>(data.z)^0x80000000u);
+        retval.data.w = data.w;


retval.data.xyz = -retval.data.xyz is more readable and compiler should get it

devshgraphicsprogramming · 2025-12-03T09:20:57Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+    static this_t normalize(NBL_CONST_REF_ARG(this_t) q)
+    {
+        this_t retval;
+        retval.data = hlsl::normalize(q.data);
+        return retval;
+    }


specialize normalize_helper from intrinsics instead

devshgraphicsprogramming · 2025-12-03T09:22:06Z

include/nbl/builtin/hlsl/math/quaternions.hlsl

+        mat[1][1] = scalar_type(0.5) - mat[1][1];
+        mat[2][2] = scalar_type(0.5) - mat[2][2];
+        mat *= scalar_type(2.0);
+        return hlsl::transpose(mat);    // TODO: double check transpose?


yes please double check the tranpose, best add some unit tests to example 22 where random vectors get rotated by random quaternions, and you check the result is same as making a mat3 out of the quaternion and mul(matrix,inputVector)

@Przemog1 can advise/help

sure, also example 22 doesn't compile on master currently, for now just comment out the code that doesn't work

split out new quaternion hlsl stuff over from hlsl path tracer example

781df86

devshgraphicsprogramming reviewed Dec 3, 2025

View reviewed changes

devshgraphicsprogramming mentioned this pull request Dec 3, 2025

Remove core::matrixSIMD #950

Open

New quaternion in HLSL #960

Are you sure you want to change the base?

New quaternion in HLSL #960

Uh oh!

Conversation

keptsecret commented Dec 2, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants